Chemometric Classification of Mangifera indica L. Leaf Cultivars, Based on Selected Phytochemical Parameters; Implications for Standardization of the Pharmaceutical Raw Materials

Introduction. Mangifera indica leaves are among the most common materials employed in manufacturing herbal medicinal products. Despite the phytochemical variation among M. indica cultivars, there are no monographs to guide the cultivation, processing, and authentication of the materials.

Methods. This study characterized 15 Ugandan M. indica leaf varieties with reference to extraction index (EI), total phenolic content (TPC), antioxidant activity (AOA), and mangiferin concentration (MC). In addition, HPLC fingerprints were established to evaluate the overall phytoequivalence of the materials. Then, using hierarchical clustering (HC) and principal component analysis (PCA), the materials were assigned quality grades.

Results. The mean EI was 9.39 ± 1.64% and varied among the varieties (P = 0.001). The TPC varied significantly (P < 0.0001), from 183.29 ± 2.36 mg/g (Takataka) to 79.47 ± 0.58 mg/g (Apple mango). AOA, expressed as IC50, ranged from 16.81 ± 2.85 μg/mL (Doodo red) to 87.85 μg/mL (Asante). MC varied significantly (P < 0.0001), from 105.75 ± 0.60 mg/g (Kate) to 39.53 ± 0.30 mg/g (Asante). HC gave four major grades, A to D, with grade A comprising the varieties with the highest TPC, MC, and AOA. These parameters decreased from group B to group D, reaching below-average values. The chromatographic fingerprints were visually similar, but the number of peaks varied from 19 (Kawanda green) to 29 (Kawanda wide), with an average of 23.5 ± 2.9 peaks. Whole fingerprints were less similar (r < 0.8) than common peak fingerprints (r > 0.9, P < 0.001). PCA grouped the fingerprints into five clusters; loading plots for PC 1 and 2 revealed two important compounds, one at Rt = 15.828 minutes (mangiferin) and the other at 6.021 minutes. Using the standardized common fingerprints, unknown field samples clustered closely with the Koona, Kate, and Kawanda green varieties.

Conclusions. The EI, TPC, MC, and AOA values can be utilized to monitor consistency in the quality of materials and the production process. The grades generated can be used to select materials for cultivation and manufacturing. Where minimum concentrations are set, materials of different concentrations can be used to dilute or concentrate each other. The HPLC fingerprints can be utilized to authenticate the materials. More samples from different agroecological regions of the country should be tested to account for climatic variation in order to develop GMP-compliant botanical identification methods.


Introduction
Mangifera indica L. is one of the most common plants in the tropical world. It is primarily grown for its delicious and nutritious fruits [1]. The fruits are considered a good source of essential amino acids such as valine, methionine, cysteine, and isoleucine; vitamins A and C; minerals including calcium, magnesium, zinc, and iron; carotenoids, especially β-carotene; the sugars maltose, glucose, and fructose; and dietary fiber [2]. Communities all over the world use different parts of M. indica, including the stembark, leaves, seed kernels, fruits, roots, and flowers, to treat different ailments [3]. Furthermore, several studies have demonstrated various bioactivities of M. indica such as antimicrobial, antitumor, antidiabetic, anti-inflammatory, antiallergic, and immunomodulatory effects [3,4].

One of the major mechanisms of action of M. indica extracts is the amelioration of oxidative stress, which arises when the body fails to detoxify reactive oxygen species (ROS) and free radicals. These compounds react with body components that have electron-rich functional groups, such as proteins, lipids, and DNA. This leads to changes in their structure and function and to the development of diseases including diabetes, cardiac damage, renal failure, hepatotoxicity, and cancers [5]. The generation of ROS is promoted by factors such as ionizing radiation, chemical pollutants, heavy metals, and drugs. Antioxidants work by sacrificial reaction with ROS to produce neutral unreactive oxygen products and/or by chelating heavy metal ions to reduce the generation of free radicals [6]. The most important antioxidants in M. indica extracts include ascorbic acid, carotenoids, and phenolic components. Of equal importance are the minerals copper, zinc, manganese, and iron, which are cofactors of enzymes relevant to the ROS detoxification cascade [4].

While carotenoids and vitamin C are high in M. indica fruit, the most important antioxidant phytochemicals in the stembark and leaves are the phenolic compounds [7]. These include tannin derivatives such as protocatechuic acid and gallic acid; flavonoids such as quercetin, catechin, and kaempferol; and xanthones such as mangiferin. The antioxidant activities of phenolic compounds have been shown to confer secondary protective effects against a number of chronic disorders including carcinogenicity, hepatotoxicity, cardiotoxicity, and diabetes. Among the most studied phenolic compounds from M. indica are mangiferin and its derivatives. Mangiferin exhibits several pharmacological activities, such as antimicrobial [8,9] and immunomodulatory [10,11] activity. In addition, mangiferin has protective effects on hepatic, cardiac, renal, and brain tissues against induced oxidative stress [12,13] and inhibits carcinogenesis [14,15].
Several studies have indicated that the phenolic content of M. indica leaf extracts varies greatly with the part of the plant [16], variety [17,18], climatic conditions at the cultivation site, and agricultural practices [7]. Therefore, pharmaceutical M. indica raw materials need to be standardized to ensure consistency in the quality of the herbal products. In Uganda, M. indica stembark and leaf materials are widely employed in manufacturing products indicated for the treatment of respiratory tract disorders including whooping cough, catarrh, sore throat, congestion from asthma, and bronchitis [19]. Various cough syrups containing the plant have been authorized for marketing by the National Drug Authority [19,20].
M. indica materials are sourced either from the wild or from fruit tree plantations. Although over 16% of the registered herbal products in Uganda contain materials from M. indica, there are no local monographs to guide the cultivation, processing, identification, and chemical characterization of the materials. Moreover, M. indica materials are not included in the readily available WHO monographs or in the West African and African pharmacopoeias [19]. The development of a botanical identification method compliant with current good manufacturing practices (cGMP) requires the establishment of chemical profiles of the materials accompanied by chemometric databases [21]. This study developed two criteria: (i) grades of leaves based on the quantity of selected phytochemical parameters and (ii) similarity of HPLC fingerprints, which can be used to select sources of the M. indica raw materials, authenticate them, and control extraction processes to ensure consistency in the quality of products. The phytochemical parameters chosen for grading the Mangifera indica cultivars were extraction index, mangiferin concentration, antioxidant activity, and total phenolic content. These parameters were chosen because of their relevance to the biological activity of the plant, as outlined above. In addition, the use of extractable matter as a quality control method is recommended by WHO, especially for plants without a suitable chemical or biological assay method [22].

Study Design.
This was an exploratory experimental study to establish the phytochemical relationships among Ugandan M. indica cultivars growing at the National Crops Resources Research Institute (NaCRRI). These cultivars were purposely bred in 2007 to increase fruit yield and have since been distributed to farmers countrywide [23]. They are Apple mango, Sejjembe, Kawanda green, Kate, Asante, Kawanda wide, Suu, Koona, Kagoogwa, MPI, Takataka, Boribo, Ngoogwe, Bire, and Doodo red [23]. To characterize the varieties, we determined their extractive indices (% yields), total phenolic contents, antioxidant activities, and mangiferin concentrations. We then used these data to classify the cultivars into different pharmaceutical raw material grades with the aid of chemometric techniques. Furthermore, HPLC fingerprints were established to evaluate the overall phytoequivalence of the leaf varieties.

Study
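The extraction index (EI), expressed as a percentage yield, was calculated as

\[ \text{EI}\,(\%) = \frac{M_r}{M_s} \times 100 \]

where M_r denotes the mass of the dried residue and M_s denotes the mass of the extracted leaf powder [22].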
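As a minimal sketch of the arithmetic, assuming the extraction index is the percentage ratio of the dried-residue mass to the mass of extracted leaf powder (the "% yield" named in the study design), the calculation could look as follows. The function name and the masses are hypothetical and serve only to illustrate the ratio.

```python
def extraction_index(residue_mass_g: float, powder_mass_g: float) -> float:
    """Percentage yield: dried residue mass over extracted powder mass, times 100."""
    if powder_mass_g <= 0:
        raise ValueError("powder mass must be positive")
    return 100.0 * residue_mass_g / powder_mass_g


# Hypothetical masses (not taken from the study), used only to show the calculation.
ei = extraction_index(residue_mass_g=0.94, powder_mass_g=10.0)
print(f"EI = {ei:.2f} %")
```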

Determination of Total Phenolic Content of the M. indica Leaf Extracts.
The total phenolic content (TPC) of the extracts was determined using the Folin-Ciocalteu method as applied in [26]. The sample powders (100 mg) were dissolved in 10 mL of distilled water. A volume of 0.5 mL of each test solution was transferred into a vial, and then 0.5 mL of Folin-Ciocalteu reagent was added. After about 10 minutes, 1.5 mL of 2% (w/v) sodium carbonate solution and 4.5 mL of distilled water were added. The reaction mixture was incubated in the dark at room temperature (28°C) for 30 minutes. The TPC of the samples was determined from their absorbance at 755 nm, in comparison with gallic acid standard solutions, using a UV/visible spectrophotometer (Jenway 6705 UV/Vis, Bibby Scientific, United Kingdom).
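The comparison with gallic acid standards can be sketched as a linear calibration. The example below assumes that the standards are carried through the same reaction as the samples and that the result is expressed as mg gallic acid equivalents per gram of powder; the standard concentrations, absorbances, and function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tpc_mg_gae_per_g(absorbance, slope, intercept, sample_mg, volume_ml):
    """Convert an absorbance into mg gallic acid equivalents per g of sample."""
    conc_mg_per_ml = (absorbance - intercept) / slope  # GAE in the test solution
    total_mg = conc_mg_per_ml * volume_ml              # GAE in the whole solution
    return total_mg / (sample_mg / 1000.0)             # per gram of powder


# Hypothetical gallic acid standards (mg/mL) and their absorbances at 755 nm.
std_conc = [0.0, 0.5, 1.0, 1.5, 2.0]
std_abs = [0.0, 0.25, 0.5, 0.75, 1.0]
slope, intercept = fit_line(std_conc, std_abs)

# 100 mg of powder dissolved in 10 mL, as in the procedure described above.
tpc = tpc_mg_gae_per_g(0.6, slope, intercept, sample_mg=100.0, volume_ml=10.0)
print(f"TPC = {tpc:.1f} mg GAE/g")
```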

Determination of Antioxidant Activity.
The antioxidant activity was determined using the DPPH scavenging assay as applied in [27]. Sample solutions were prepared by dissolving the powdered leaf materials (0.1 g) in 10 mL of 99.9% methanol, with orbital shaking at 200 rpm for 30 minutes. The solutions were then filtered and made up to 10 mL with methanol. To prepare the standard solutions of ascorbic acid, 0.01 g was dissolved in 10 mL of 99.9% methanol; then, 1 mL of this solution was further diluted to 10 mL with methanol.
To determine the scavenging activity, 20, 30, 40, and 50 μL of the ascorbic acid solutions were each added to a solution of 3 mL DPPH (0.0039 mg/mL) and 1 mL methanol and shaken to mix. The absorbance of each solution was measured at 517 nm. Methanol was used as the negative control, while the DPPH (0.0039 mg/mL) solution was the blank. The activity of the test solutions was determined similarly by adding 20, 30, 40, and 50 μL of each sample solution to a solution containing 3 mL of DPPH and 1 mL of methanol. The decrease in absorbance of the ascorbic acid and sample solutions was calculated in comparison with a blank sample containing only methanol and DPPH. The percentage decrease in absorbance (hereafter referred to as the percentage inhibition) was calculated according to the following equation:

\[ \%\,\text{inhibition} = \frac{\text{Abs}_{\text{blank}} - \text{Abs}_{S}}{\text{Abs}_{\text{blank}}} \times 100 \]

where Abs_blank denotes the absorbance of the blank sample and Abs_S denotes the absorbance of either the test solution or the standard ascorbic acid solution. The % inhibition data were plotted against concentration to determine the amount of the sample and of ascorbic acid that inhibits 50% of the DPPH (IC50), the measure of antioxidant capacity.

[28][29][30]. The final solvent system was composed of methanol (31%) and 0.01% orthophosphoric acid (69%). The optimal mobile phase flow rate was 1 mL/min, while the column temperature was 25°C.
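The percentage inhibition and IC50 steps can be sketched as below. The percentage inhibition follows the definitions of the blank and sample absorbances given above. The text does not state how IC50 was read from the plot, so linear interpolation between the two bracketing points is an assumption here, and all absorbances and concentrations are hypothetical.

```python
def percent_inhibition(abs_blank, abs_sample):
    """Percentage decrease in absorbance relative to the DPPH blank."""
    return 100.0 * (abs_blank - abs_sample) / abs_blank


def ic50_by_interpolation(concs, inhibitions):
    """Linear interpolation between the two points that bracket 50 % inhibition.

    Assumption: the source only says the data were plotted against
    concentration; interpolation is one simple way to read off IC50.
    """
    points = sorted(zip(concs, inhibitions))
    for (c0, i0), (c1, i1) in zip(points, points[1:]):
        if min(i0, i1) <= 50.0 <= max(i0, i1) and i0 != i1:
            return c0 + (50.0 - i0) * (c1 - c0) / (i1 - i0)
    raise ValueError("50 % inhibition is not bracketed by the data")


# Hypothetical readings: blank absorbance and sample absorbances at 517 nm.
abs_blank = 0.8
concs_ug_per_ml = [10, 20, 30, 40]
abs_samples = [0.6, 0.5, 0.36, 0.2]
inhibitions = [percent_inhibition(abs_blank, a) for a in abs_samples]
print(f"IC50 = {ic50_by_interpolation(concs_ug_per_ml, inhibitions):.1f} ug/mL")
```

A lower IC50 means that less material is needed to quench half of the DPPH, so lower values indicate higher antioxidant activity.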

Quantification of Mangiferin in the Mangifera indica Test Solutions.
The concentration of mangiferin was determined from standard calibration curves and the corresponding peak areas. The peaks due to mangiferin were identified after spiking the samples three times; the average retention times were computed and used to identify the marker in the rest of the samples.

Validation of HPLC Quantitative Methods.
The accuracy of the method was determined by computing the percentage recovery of mangiferin from three spiked samples. The intraday and interday repeatability were obtained from the percentage relative standard deviation of three different samples at different concentration levels on the same day and after three days, respectively. The linearity was obtained from the regression equation of the standard calibration curve. The limit of detection (LOD) and the limit of quantification (LOQ) of mangiferin were calculated as 3.3 × SD/slope and 10 × SD/slope, respectively. We assessed peak purity by spiking samples; a selective increase in the mangiferin peak areas and heights indicated peak purity.
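The validation figures described above reduce to a few short formulas, sketched here under stated assumptions. The text does not say which standard deviation enters the LOD and LOQ expressions (for example, that of the regression residuals or of the intercept), so `sd_response` is a placeholder for whichever was used; the recovery formula is the usual spiked-minus-unspiked form, and all numbers are hypothetical.

```python
from statistics import mean, stdev


def lod_loq(sd_response, slope):
    """LOD = 3.3 * SD / slope and LOQ = 10 * SD / slope, as stated in the text."""
    return 3.3 * sd_response / slope, 10.0 * sd_response / slope


def percent_recovery(found_spiked, found_unspiked, amount_added):
    """Share of the added mangiferin that the method recovers (assumed form)."""
    return 100.0 * (found_spiked - found_unspiked) / amount_added


def percent_rsd(values):
    """Relative standard deviation (sample SD over mean), in percent."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical numbers, for illustration only.
lod, loq = lod_loq(sd_response=2.0, slope=100.0)
print(f"LOD = {lod:.3f}, LOQ = {loq:.3f} (units of the calibration x-axis)")
print(f"Recovery = {percent_recovery(15.0, 10.0, 5.0):.1f} %")
print(f"RSD = {percent_rsd([98.0, 100.0, 102.0]):.1f} %")
```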

Data Analysis
Data were captured, stored, and cleaned in Microsoft Excel 2019®. The variation of the quantitative data (extraction index, total phenolic content, mangiferin concentration, and antioxidant activity) among leaf varieties was analysed by one-way ANOVA, followed by Tukey's multiple comparison tests, in GraphPad Prism 9® and Minitab 19® at the 95% confidence level. Classical hierarchical cluster analysis of these data, based on Euclidean distances (PAST 4®), was used to generate groups; we then used these clusters to assign the M. indica leaf materials to pharmaceutical grades corresponding to levels of extraction index, total phenolic content, mangiferin concentration, and antioxidant activity.
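To make the one-way ANOVA step concrete, the sketch below computes the F statistic from replicate measurements grouped by variety. It is not the GraphPad Prism or Minitab procedure: the P value and Tukey's comparisons require the F and studentized range distributions, which the Python standard library does not provide, so only the F ratio and its degrees of freedom are shown. The replicate values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of replicates around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate measurements for three varieties.
replicates = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(replicates)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```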
The HPLC fingerprints were analysed qualitatively by visualization and semi-quantitatively by computing similarity indices. Fingerprints with the best resolution of the components, as indicated by the number of peaks and peak symmetry, were identified. For fingerprint similarity analysis, whole fingerprints (including all peaks) and common fingerprints (with peaks that were common to all samples according to retention time) were identified. Whole fingerprints were compared visually and by calculating similarity indices based on peak areas. For common fingerprints, the relative retention times of the common peaks with reference to mangiferin were computed. Furthermore, fingerprint peaks were compared by similarity indices (correlation coefficient, r > 0.9, at P = 0.01) and by principal component analysis (PCA) using PAST 4® software. For PCA, similarity was evaluated by the minimum spanning tree distances in a scatter plot of the first two components [30,31]. Additionally, loading plots showed the peaks responsible for the variation among fingerprints.
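The construction of common fingerprints and relative retention times can be sketched as follows. The matching tolerance (0.05 minutes) and the sample retention times are assumptions made for illustration; the text only states that common peaks were matched according to retention time and referenced to mangiferin.

```python
def relative_retention_times(retention_times, reference_rt):
    """Retention time of each peak divided by that of the reference peak."""
    return [rt / reference_rt for rt in retention_times]


def common_peaks(samples, tolerance=0.05):
    """Peaks of the first sample that have a match in every other sample.

    A match is any peak whose retention time lies within `tolerance`
    minutes (an assumed value, not taken from the source).
    """
    reference, others = samples[0], samples[1:]
    return [
        rt for rt in reference
        if all(any(abs(rt - other_rt) <= tolerance for other_rt in sample)
               for sample in others)
    ]


# Hypothetical retention times (minutes) for three chromatograms.
chromatograms = [
    [3.10, 6.02, 15.83],
    [6.03, 9.40, 15.82],
    [6.00, 15.85, 20.00],
]
shared = common_peaks(chromatograms)
print(shared)
# The last shared peak plays the role of the reference (mangiferin) peak here.
print(relative_retention_times(shared, reference_rt=shared[-1]))
```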

Results and Discussion
This work aimed at establishing standards for the classification of Mangifera indica leaf materials obtained from different varieties. To classify the materials, selected parameters relevant to the management of respiratory disorders were quantified and analysed by chemometric techniques to generate different quality grades. M. indica extracts alleviate symptoms of respiratory tract disorders by reducing inflammation of the airways and by chelating and neutralizing harmful substances, thereby reducing irritation and damage of the respiratory mucosa, as well as by regulating immune responses. These activities have been demonstrated in M. indica extracts rich in phenolic compounds, particularly mangiferin. Since the antioxidant activity results from the combined effects of many compounds in addition to phenolics, including vitamins, terpenoids, and minerals, it is logical to consider it as a separate phytochemical parameter. In addition to these parameters, HPLC fingerprints were included to give a general picture of the phytochemical variation and to determine whether the M. indica leaf varieties are phytoequivalent. The use of chemometric methods to analyse the data enabled the generation of distinct patterns (grades) of M. indica leaves. The results are summarized in Table 1, and each parameter is discussed in detail below.

Extraction Index (EI).
The extraction index measures the non-structural proportion of the drug biomass that is extracted by solvents. The extractable matter thus contains primary metabolites, including proteins, lipids, and carbohydrates and their building units, and secondary metabolites such as waxes, terpenes, gums, resins, phenolics, alkaloids, essential oils, and inorganic compounds [32]. For native extracts (to which no excipients or other substances are added), the extractable matter is also the final drug, and thus the extraction index reflects the efficiency of the processing method [33]. Therefore, plant materials with high yields are desirable for the profitability of the herbal medicine business. In this study, the EI of M. indica leaf materials was highest for the Kawanda wide variety at 12.96 ± 0.60% and lowest for the MPI variety at 7.74 ± 1.91% (Table 1). These values were similar to those obtained by other researchers [34].
Besides quantification of the drug, WHO recommends the use of extractable matter as a quality control method, especially for plants without a suitable chemical or biological assay method [22]. As such, the extraction index can be used to monitor consistency in the quality of raw materials and the extraction process, or to monitor the effect of changes in the manufacturing process and plant source (variety or species). Furthermore, the extract strength is relevant for calculating the dosages of the individual materials to include in the product formula [33].

Total Phenolic Content (TPC).
The TPC of M. indica is a summation of phenolic acids, flavonoids, and xanthones, the main phytochemicals implicated in antioxidant activity. Therefore, the TPC is an indicator of the quality of materials intended for use as antioxidants or for indications based on antioxidant activity; this approach is easier, cheaper, and more relevant than quantifying the individual compounds. The TPC and/or individual phenolic compounds are known to vary among M. indica cultivars [17,18]. In this study, the TPC of the materials varied significantly with the M. indica variety (P < 0.0001), as demonstrated by Tukey's multiple comparisons (Table S1). The Takataka variety had the highest content, followed by Kagoogwa, while the Apple mango and Sejjembe varieties had the lowest (Table 1). The TPC values were similar to those reported earlier [35]. Since there are no established limits for the TPC of M. indica materials or products for the treatment of respiratory tract disorders, it is incumbent upon the manufacturer to establish the minimum acceptable TPC of raw materials after establishing its relevance to the bioactivity (and indication) of their materials and/or products. This can be done by designing dose-response experiments to establish the relationship between the TPC of the materials and/or product and the ability to ameliorate the symptoms of the disease.

Antioxidant Activity (AOA) of Mangifera indica Leaves.
Antioxidant activity (AOA) is one of the major biological activities of M. indica extract. Indeed, some researchers have postulated that many of the other pharmacological activities of the plant are secondary to its ability to scavenge ROS involved in the pathogenesis of the diseases. As such, brain-protective [36], antidiabetic [37], cardioprotective, anti-inflammatory, hepatoprotective [38,39], renoprotective, and anticancer activities [40] have been demonstrated. This implies that a measure of the AOA of the raw materials is a direct measure of their potency and so helps ensure pharmacological reproducibility [41]. This approach is more appropriate than measuring quantities of individual compounds or groups of compounds (e.g., TPC); besides, it is not cost-effective to determine all the active compounds. In addition, as a quality assurance measure, it is more practical for the manufacturer to measure the AOA of the plant materials than to determine the therapeutic effect (in this case, several activities relevant to treating respiratory tract disorders). Given that the composition of antioxidant phytochemicals varies with the variety or plant species, the AOA of M. indica leaf materials is expected to vary with the source cultivar. In this study, it varied significantly (P < 0.0001).

Mangiferin Concentration (MC).
Mangiferin is one of the most studied phenolic compounds of Mangifera indica, with several pharmacological activities as outlined in the introduction. In the respiratory tract, mangiferin reduces inflammation of the airway, inhibits cytokine production, and protects against lipopolysaccharide-induced allergy [43,44]. This makes it a favorable candidate for use as an activity marker [45]. In addition to having a diverse biological profile, mangiferin is found in only a few other plant species, such as Iris unguicularis, Anemarrhena asphodeloides, Bombax ceiba, Salacia sp., Cyclopia sp., and Crocus sp. [46], which are morphologically distinct from M. indica. This makes it ideal as a bioanalytical marker. Besides, the analytical standard is readily available commercially, and mangiferin can be easily isolated in high amounts from several parts of the plant using common solvents like methanol and ethanol. In addition, mangiferin can be quantified by basic spectroscopic and HPLC methods (HPLC was used in this study). For use as pharmaceutical raw materials, cultivars with high mangiferin concentrations are desirable. In this study, the Kate variety had the highest mangiferin concentration, followed by Koona and Ngoogwe, while Asante, Apple mango, and Kawanda wide had the lowest (Table 1). The MC varied significantly with M. indica variety (P < 0.0001), as demonstrated by ordinary one-way ANOVA and Tukey's multiple comparisons (Table S3).

Relationships among the Phytochemical Parameters.
The Pearson correlation analysis of the parameters showed a direct (positive) relationship, as expected, with the correlation between the mangiferin concentration and antioxidant activity being statistically significant at α = 0.05 (Table 2).
The TPC correlated positively with the extraction index, although not significantly (r = 0.443, P = 0.098). In addition, the TPC/EI ratio, an indicator of how much of the extracted matter is active drug (where activity due to phenolic compounds is of primary interest), ranged between 8.0 and 18.4, with a mean of 14.2. The Koona, Kawanda green, and Takataka varieties had the highest TPC per unit percentage yield of the extract, while Apple mango, Doodo red, and Sejjembe had the lowest (Table S1). This observation can be explained by the fact that 70% ethanol extracts a variety of relatively polar compounds besides phenolic compounds, the concentration of which could also vary among varieties. Some of these compounds, like terpenoids and minerals, augment the bioactivity of phenolics and so are desirable [4].
A correlation analysis indicated that the AOA of M. indica materials increases with the TPC, although the relationship was weak and not statistically significant (r = −0.292, P = 0.291); r is negative because AOA is expressed as IC50, which decreases as activity increases. Indeed, some samples with high TPC, such as Asante, Kawanda green, and Kawanda wide, had low antioxidant activity. This notwithstanding, most varieties showed a direct relationship between the TPC and AOA, for example, Kate, Koona, Kagoogwa, and Takataka. Some varieties, such as Doodo red, had comparatively high AOA despite lower TPC (Table 1). The low correlation between TPC and AOA can be explained by three factors. First, M. indica contains non-phenolic antioxidant compounds such as terpenoids, carotenoids, vitamins E and C, and minerals. Second, it contains phenolic compounds with low or no AOA, such as amino acids [7,16]. However, these components are known to concentrate mainly in the fruits, and their role as antioxidants in other parts of the plant is yet to be elaborated. Third, the concentrations of the specific phenolic compounds, which differ in activity, vary among varieties. Structure-activity relationship analysis of different phenolic compounds indicates that antioxidant activity is affected by the number of aromatic and hydroxyl groups (Figure 1) as well as their relative positions in the structure [47], and thus tannins (A), flavonoids (B), and xanthones (C) exhibit different levels of activity.
Thus, an M. indica variety may have a low TPC but contain a high concentration of the compound(s) with high activity, and vice versa. In addition, some phenolic compounds might work additively, synergistically, or antagonistically, as demonstrated in [48]. Although the correlation between the TPC and AOA was low, the AOA/TPC ratio should be established and utilized to monitor the consistency in the composition of the herbal materials. The AOA/TPC ratio ranged from 1.3 to 8.9, with an average of 4.2 (Figure S1).
There was a significant correlation between the mangiferin concentration (MC) of the Mangifera indica leaves and their antioxidant activity (AOA) (r = −0.567, P = 0.028). This shows, as expected, that the AOA increases (that is, the IC50 decreases) with an increase in the MC. This correlation was stronger than that between TPC and AOA. The results demonstrate the importance of mangiferin as an antioxidant component of M. indica leaves, qualifying it as a marker for AOA. Since |r| < 1, mangiferin alone does not account for the observed AOA, which is consistent with the total AOA resulting from the combined action of various phytochemicals. Indeed, some samples with relatively lower MC, such as Takataka, had relatively high AOA; such samples are likely to be rich in non-mangiferin antioxidant compounds. The AOA/MC ratios ranged from 1.2 to 6.9, with an average value of 2.8 (Table S1). The manufacturer can set a minimum acceptable ratio depending on the relevance of mangiferin to the application of the materials or products. Generally, there was a positive, although not significant, correlation between the mangiferin concentration (MC) of the M. indica leaves and the total phenolic content (TPC) (r = 0.363, P = 0.184). The samples with the highest MC per TPC were Ngoogwe, Bire, and Kate, while Asante, Kawanda wide, and Kawanda green had the lowest. The MC/TPC ratio ranged from 0.3 to 0.6, with a mean of 0.5. These results show that mangiferin is just one of the phenolic compounds in M. indica.
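The Pearson correlations reported in this section can be reproduced with a few lines of arithmetic. The sketch below computes r and the associated t statistic with n − 2 degrees of freedom; converting t to a P value needs the t distribution, which is not in the Python standard library, so that step is left out. The function names are illustrative, and n = 15 is assumed from the number of varieties studied.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# A perfectly inverse toy relationship: as x rises, y (e.g., an IC50) falls.
print(pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))

# Reported MC versus AOA correlation, assuming n = 15 varieties.
print(f"t = {t_statistic(-0.567, 15):.2f} with 13 degrees of freedom")
```

Because AOA is expressed as IC50, a negative r between MC and IC50 corresponds to higher activity at higher mangiferin concentration.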

Classification of the Mangifera indica Leaf Varieties Based on Extraction Index, Total Phenolic Content, Mangiferin Concentration, and Antioxidant Activity

Clustering Analysis.
Clustering is a multivariate analysis tool that groups samples based on the similarity of the measured parameters. In this study, we used hierarchical clustering to classify Mangifera indica leaf cultivars according to the variation of four parameters, namely, extraction index, total phenolic content, mangiferin concentration, and antioxidant activity. Clustering was performed using Ward's method. Four main groups, A, B, C, and D, were obtained (Figure 2). Group A contains the varieties with the highest total phenolic contents, mangiferin concentrations, and antioxidant activities. These parameters decrease from group B to group D, reaching below-average values. Based on these clusters and on the average quantities of the studied parameters, we generated four grades of M. indica leaf materials. These are summarized in Table 3.
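The grouping step can be illustrated with a small agglomerative clustering routine that uses Ward's criterion, merging at each step the two clusters whose union gives the smallest increase in within-cluster sum of squares. This is a sketch, not the PAST 4 implementation: z-scoring the columns beforehand is an assumption (the text does not say whether the parameters were scaled), and the four rows of data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so that no parameter dominates through its units."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion; returns row-index lists."""
    clusters = [[i] for i in range(len(points))]
    dims = len(points[0])

    def centroid(members):
        return [sum(points[m][d] for m in members) / len(members)
                for d in range(dims)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        ca, cb = centroid(a), centroid(b)
        sq_dist = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * sq_dist

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical rows: EI (%), TPC (mg/g), MC (mg/g), IC50 (ug/mL).
data = [
    [9.0, 180.0, 100.0, 20.0],
    [9.5, 170.0, 95.0, 25.0],
    [8.0, 90.0, 45.0, 80.0],
    [8.5, 85.0, 40.0, 85.0],
]
print(ward_clusters(standardize(data), n_clusters=2))
```

With real data, the number of clusters would be chosen from the dendrogram rather than fixed in advance.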
The contribution of the parameters to the observed clusters is illustrated by a PCA scatter biplot (Figure 3).
From Figures 2 and 3, it is clear that the total phenolic content varies most, followed by the mangiferin concentration and antioxidant activity. The extraction index is the least variable parameter. The grades generated can guide manufacturers and botanists in selecting the best varieties for use as pharmaceutical raw materials. For therapeutic applications for which the studied parameters are relevant, such as respiratory tract disorders (as is the case in Uganda), samples with high values of these parameters are preferred. However, the manufacturer might need to determine the minimum amount of each parameter that provides optimum potency of the product; this was beyond the scope of this work.

Evidence-Based Complementary and Alternative Medicine

Classification of the Mangifera indica Leaf Varieties Based on Fingerprint Characteristics.
The classification of Mangifera indica leaf varieties is based on common fingerprint pattern recognition and on multivariate analysis of common peak areas (common fingerprints) and whole chromatogram peak areas (whole fingerprints). A typical fingerprint is shown in Figure 4.

Visual Analysis for Pattern Recognition of HPLC Fingerprints.
Visual inspection of the 30-minute whole fingerprints showed high similarity (Figure 5). However, individual fingerprints varied greatly in the number of peaks, from 19 in Kawanda green to 29 in Kawanda wide, with an average of 23.5 ± 2.9 peaks. The total peak areas ranged from 5,863,448 mV·min for the Kate variety to 1,568,633 mV·min for the Asante variety, with an average area of 2,457,451 ± 1,026,790 mV·min (Table S5). These results demonstrate marked phytochemical variability among the M. indica varieties. This can be explained by the fact that the genetic makeup of plants determines the nature and amounts of plant metabolites by influencing the nature and number of enzymes and cofactors produced by a particular cultivar or subspecies [49]. Also, cultivation of plants in non-natural habitats may affect their metabolic rate because of unfavorable climatic conditions in the new environments [50]. Thus, a project that aims to domesticate medicinal plants needs prior investigation of the suitability of the agroecological factors in the new habitat. In the absence of specific markers, whole fingerprints can be used to demonstrate the phytoequivalence of medicinal plant varieties and also to study the effect of changes in cultivation, harvesting, and postharvest handling practices [51].
To reduce the complexity and cost of analysing many markers in plant materials, the Chinese pharmacopoeia recommends analysis of common fingerprints, which are constructed from peaks that are common to all samples (same retention times) [52]. Common fingerprints are also easier to reproduce than whole fingerprints. For this study, we obtained ten common peaks (Figure 4); their retention times and peak areas are shown in Table S4. The variety "Kawanda wide," which had the highest number of peaks, was used as the reference in the selection and matching of peaks. The total area of common fingerprint peaks ranged from 3,710,796 mV·min to 961,454 mV·min, with an average of 1,652,214 ± 652,910 mV·min. The variations in peak areas are proportional to the variations in concentration of the compounds responsible for the peaks and thus show the variability of the samples. The pattern of the peaks is characteristic of the plant material for the specified analysis conditions and so can be used to identify and authenticate the materials. The fingerprints we developed were reproducible (Figures S1 and S2). In the absence of markers, or if the chemical composition of the material is not known, strong peaks (peak area more than 10% of the total area) are used as the reference in describing the relative positions and areas of other peaks [53]. For this study, only peak 10 (mangiferin, Rt = 15.828 minutes) was a strong peak, making up more than 70% of the total peak area (Table S5).

Similarity Analysis of Fingerprints
Fingerprint similarity can be assessed by calculating similarity indices such as correlation coefficients (r), cosines (c), and Euclidean distances (ED), among others. We calculated r for the whole and common fingerprints to determine the similarity of the leaves from different M. indica varieties (note: the whole fingerprints consist of all peaks in the chromatogram of each sample, while the common fingerprints comprise only the peaks that are "common" to the chromatograms of all samples).
The whole fingerprints showed low correlation, with only a few fingerprints having r > 0.8 (Figure 6).
On the other hand, the Pearson correlation analysis of common fingerprints showed that all the Mangifera indica leaf varieties were significantly similar (r > 0.9, P < 0.001), as shown in Figure 7.
This information can guide the manufacturer in selecting phytoequivalent materials. According to the Chinese pharmacopoeia, only samples with r ≥ 0.9 are considered identical and therefore phytoequivalent; such materials can be substituted without significantly altering the chemical composition and thus the potency of the product [52]. This approach is more accurate than analysing just a few markers [54]. Hence, from Figure 6, the following varieties are equivalent: Apple mango = Doodo red = MPI; Asante = MPI = Doodo red; Bire = Boribo = Doodo red = Koona = Ngoogwe; Takataka = Kate; Kawanda green = Sejjembe. The Kawanda wide and Suu varieties have no substitutes. Although common fingerprint analysis gives higher r values (Figure 7), these fingerprints are based on only a few compounds. Therefore, it is crucial to study the plant material extensively to ensure that the selected peaks represent the most important active phytochemicals, in order to generate an accurate bioactivity fingerprint [31]. Nevertheless, common peak fingerprints are valuable in authenticating herbal raw materials or products [55].
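The phytoequivalence screen described above amounts to computing pairwise correlation coefficients between peak-area vectors and keeping the pairs at or above the r ≥ 0.9 threshold. The sketch below shows this with the Pearson coefficient and, for comparison, the cosine index; the variety labels and peak areas are hypothetical, and each vector is assumed to list the areas of the same peaks in the same order.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two peak-area vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cosine_similarity(x, y):
    """Cosine of the angle between two peak-area vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))


def equivalent_pairs(fingerprints, threshold=0.9):
    """Pairs of samples whose fingerprints correlate at or above the threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(fingerprints), 2)
        if pearson_r(fingerprints[a], fingerprints[b]) >= threshold
    ]


# Hypothetical peak areas for three samples over the same three peaks.
fingerprints = {
    "V1": [10.0, 20.0, 70.0],
    "V2": [12.0, 18.0, 75.0],
    "V3": [70.0, 20.0, 10.0],
}
print(equivalent_pairs(fingerprints))
print(round(cosine_similarity(fingerprints["V1"], fingerprints["V2"]), 3))
```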

Principal Component Analysis (PCA) of Common Fingerprints
In the PCA scatter plot, all the Mangifera indica leaf varieties lie within the 95% ellipse apart from Kate (Figures 8 and 9).
Since a high concentration of mangiferin is not undesirable, we did not eliminate the Kate variety from the classification but rather assigned it to a separate group. Thus, from Figure 8, five major groups are noticeable: A (Kate), B [...] Figures 10 and 11. Here, the loading plots for principal components 1 and 2 revealed that most of the sample variance is caused by two compounds, one at Rt = 15.828 minutes (mangiferin) and the other at 6.021 minutes.
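To illustrate how a loading plot points to the peaks that drive sample variance, the sketch below extracts the first principal component of a small peak-area table by power iteration on the covariance matrix. It is only an assumption-laden illustration: the data are hypothetical, no scaling is applied, and the study itself presumably used dedicated statistical software.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for mean-centred data via power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p) of the peak areas.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # starting vector; assumed not orthogonal to PC1
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores

# Hypothetical peak areas: rows are samples, columns are three common peaks.
areas = [
    [10.0, 5.0, 100.0],
    [11.0, 5.0, 160.0],
    [9.0, 6.0, 220.0],
    [10.0, 4.0, 300.0],
]

loadings, scores = first_principal_component(areas)
dominant_peak = max(range(len(loadings)), key=lambda j: abs(loadings[j]))
print("PC1 loadings:", loadings)
print("Peak with largest loading (0-based index):", dominant_peak)
```

The peak with the largest absolute loading contributes most to the separation of samples along that component. The sign of the loadings is arbitrary, so only their magnitudes are compared here.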

Comparison of the HPLC Fingerprints of the M. indica Cultivars with Those of Unidentified Samples Collected from Various Parts of the Country (Field Samples)
To identify the groups of materials to which the field samples relate, the common fingerprints of unknown samples collected from ten different districts in Uganda were compared to the standardized fingerprints of the 15 cultivars. The results indicated that most of the field samples were close to Koona, Kate, and Kawanda green varieties. The others were close to Takataka and Bire varieties, as illustrated by Figure 12.
While the actual variety to which the field samples belong can only be confirmed with genetic studies such as DNA barcoding, the results obtained give an insight into the most cultivated Mangifera indica varieties and thus the sources of the herbal materials. According to [23], Kate, Koona, Kawanda green, Takataka, and Bire are among the high fruit yielding varieties, hence grown by most households. Coincidentally, Kate, Koona, and Takataka varieties belong to group A (high in TPC, MC, and AOA), according to our grading, while Kawanda green belongs to group B. However, Bire belongs to the lowest group (D). More samples are needed to further test the method and validate these results.
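One simple way to relate an unknown sample to reference fingerprints is sketched below. It assumes that each fingerprint is first normalised to relative peak areas and that the nearest reference is chosen by Euclidean distance. This is an illustrative stand-in, not necessarily the clustering procedure used for Figure 12, and all names and values are hypothetical.

```python
import math

def relative_areas(areas):
    """Express each peak area as a fraction of the total area."""
    total = sum(areas)
    return [a / total for a in areas]

def closest_reference(sample, references):
    """Return (name, distance) of the reference profile nearest to the sample."""
    profile = relative_areas(sample)
    best = None
    for name, areas in references.items():
        d = math.dist(profile, relative_areas(areas))
        if best is None or d < best[1]:
            best = (name, d)
    return best

# Hypothetical common-peak areas for two reference cultivars and one field sample.
references = {
    "Reference A": [10.0, 20.0, 70.0],
    "Reference B": [30.0, 30.0, 40.0],
}
field_sample = [22.0, 38.0, 140.0]

name, distance = closest_reference(field_sample, references)
print(name, round(distance, 4))
```

Normalising to relative areas removes differences in overall extract concentration, so the comparison reflects the peak pattern only.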

Conclusions
This study has demonstrated that the Mangifera indica leaf materials have relatively similar ethanolic extractive indices but differ in total phenolic content and mangiferin concentration, and thus in antioxidant activity. Based on these parameters, we graded the raw materials and showed the varieties that can be substituted (those in the same quality grade) for production of medicines for respiratory tract disorders. These parameters and the ratios of their quantities can also be utilized to monitor the consistency in the quality of materials and the production process. In addition to the quality grades, we also developed HPLC fingerprints which can be utilized to authenticate the materials by demonstrating phytoequivalence at correlation coefficients greater than 0.9. For standardization purposes, where minimum required mangiferin marker concentrations are set, materials of different concentrations can be used to standardize each other (i.e., by dilution or concentration). This approach is preferred to addition of pure markers (e.g., mangiferin) to dilute samples or addition of inactive substances (excipients) to concentrated materials. Also, to develop a GMP-compliant botanical identification method (BIM), both "diluted" and "concentrated" materials are needed. That said, more sampling and testing are necessary to cater for as much variability as possible. There is also a need to test materials growing in different agroecological regions of the country to cater for climatic influence and generalize the application of the analytical parameters.
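The idea of standardizing materials against each other can be made concrete with a simple mass-balance sketch. It assumes that mangiferin concentrations mix linearly by mass. The two concentrations are the Kate and Asante means reported in this study, whereas the target value of 80 mg/g and the function name are purely hypothetical.

```python
def blend_fraction(c_high, c_low, c_target):
    """Mass fraction of the richer material needed to reach c_target (linear mixing)."""
    if c_high <= c_low:
        raise ValueError("c_high must exceed c_low")
    if not c_low <= c_target <= c_high:
        raise ValueError("target must lie between the two concentrations")
    return (c_target - c_low) / (c_high - c_low)

kate_mc = 105.75    # mg/g, mean mangiferin concentration reported for Kate
asante_mc = 39.53   # mg/g, mean mangiferin concentration reported for Asante
target_mc = 80.0    # mg/g, hypothetical specification

x = blend_fraction(kate_mc, asante_mc, target_mc)
print(f"Kate fraction: {x:.3f}, Asante fraction: {1 - x:.3f}")
```

A target outside the range spanned by the two materials cannot be reached by blending alone, which is why the function rejects it.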

Data Availability
The datasets used to support the findings of this study are included within the supplementary information file.

Ethical Approval
This work was approved by the Research and Ethics Committee of Mbarara University of Science and Technology.

Disclosure
The funder contributed research money and stipends (SD) but was not involved in planning and implementation of the study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Supplementary Materials
Table S1: relationship among the parameters. Table S2: Tukey's multiple comparisons of leaf TPC with source variety of Mangifera indica. Table S3: Tukey's multiple comparisons of leaf mangiferin concentration with the Mangifera indica variety. Table S4: validation of the HPLC method for quantification of mangiferin in Mangifera indica leaves. Table S5: characteristics of whole chromatogram fingerprints. Table S6: common peaks used to calculate fingerprints. Table S7: Tukey's multiple comparisons of leaf antioxidant activity with the Mangifera indica variety. Figure S1: interday repeatability of Mangifera indica leaf fingerprints (samples for Koona variety were used). Figure